Notion to Docusaurus Migration
This guide documents how to migrate documentation from a Notion export to Docusaurus, including the scripts used to automate folder organization, frontmatter insertion, and MDX syntax fixes.
Overview
The migration process involves four main steps:
- Export from Notion: Export pages as Markdown & CSV.
- Reorganize Structure: Map flat Notion export to Docusaurus folder hierarchy.
- Add Frontmatter: Inject Docusaurus metadata (ID, title, sidebar position).
- Fix MDX Syntax: Resolve compatibility issues (angle brackets, table formatting).
Step 1: Notion Export
Export your Notion workspace or page with the following settings:
- Export format: Markdown & CSV
- Include subpages: Yes
- Create folders for subpages: Yes
Step 2: Reorganize Structure
Use the reorganize_wpcli_structure.py script to move files into a clean hierarchy and create _category_.json files.
Script: reorganize_wpcli_structure.py
This script defines a STRUCTURE list mapping folders to files and moves them accordingly.
#!/usr/bin/env python3
"""
Reorganize WP-CLI docs into hierarchical folder structure with _category_.json files
"""
import os
import json
import shutil
BASE_DIR = "/opt/docker-data/apps/docusaurus/site/docs/wordpress/wp-cli"
# Example Structure (truncated)
STRUCTURE = [
    {
        "folder": "01-introduction",
        "label": "1. Introduction",
        "position": 1,
        "files": ["what-is-wp-cli.md", "installation-set-up.md"]
    },
    # ... other folders
]

def reorganize_docs():
    for module in STRUCTURE:
        folder_path = os.path.join(BASE_DIR, module["folder"])
        os.makedirs(folder_path, exist_ok=True)
        # Create _category_.json
        category_data = {
            "label": module["label"],
            "position": module["position"],
            "link": {"type": "generated-index"}
        }
        with open(os.path.join(folder_path, "_category_.json"), 'w') as f:
            json.dump(category_data, f, indent=4)
        # Move files...
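The file-moving step is elided in the excerpt above. A minimal sketch of what it might look like, assuming the files listed under each module's "files" key sit directly under the base directory before the move. The helper name `move_module_files` and the choice to return missing filenames are illustrative assumptions, not part of the original script:

```python
import os
import shutil


def move_module_files(base_dir, module):
    """Move each file in module["files"] from base_dir into the module folder.

    Hypothetical helper: assumes source files sit directly under base_dir.
    Returns the filenames found neither at the source nor at the destination.
    """
    folder_path = os.path.join(base_dir, module["folder"])
    os.makedirs(folder_path, exist_ok=True)
    missing = []
    for filename in module["files"]:
        src = os.path.join(base_dir, filename)
        dst = os.path.join(folder_path, filename)
        if os.path.isfile(src):
            shutil.move(src, dst)
        elif not os.path.exists(dst):
            # Not at the source and not already moved: report it.
            missing.append(filename)
    return missing
```

Returning the missing names rather than raising keeps the script re-runnable: a second pass skips files that were already moved and only reports the ones that are truly absent.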
Step 3: Add Frontmatter
Use add_wpcli_frontmatter.py to add the Docusaurus frontmatter block (id, title, sidebar label, sidebar position) to each file.
Script: add_wpcli_frontmatter.py
#!/usr/bin/env python3
"""
Add proper frontmatter to all WP-CLI documentation files
"""
import os
BASE_DIR = "/opt/docker-data/apps/docusaurus/site/docs/wordpress/wp-cli"
FRONTMATTER_MAP = {
    "01-introduction": [
        {"file": "what-is-wp-cli.md", "id": "what-is-wp-cli", "title": "What is WP-CLI?", "label": "What is WP-CLI?", "pos": 1},
        # ...
    ]
}

def create_frontmatter(id_val, title, label, position):
    return f"""---
id: {id_val}
title: {title}
sidebar_label: {label}
sidebar_position: {position}
---
"""
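The excerpt shows how the frontmatter block is built but not how it is written into each file. One possible way to apply it is sketched below, assuming that a file whose first non-blank line is `---` already has frontmatter and should be left untouched. `prepend_frontmatter` is a hypothetical helper name, not taken from the original script:

```python
def prepend_frontmatter(content, frontmatter):
    """Return content with frontmatter prepended, unless it already has one.

    Hypothetical helper: a first non-blank line of '---' is treated as
    existing frontmatter, which makes repeated runs safe.
    """
    first_line = content.lstrip().split("\n", 1)[0].strip()
    if first_line == "---":
        return content
    # Keep one blank line between the frontmatter and the document body.
    return frontmatter + "\n" + content.lstrip("\n")
```

Operating on strings instead of file paths keeps the check easy to test; the caller would read each file, pass its text through this function, and write the result back only if it changed.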
Step 4: Fix MDX Syntax Errors
Notion exports often contain characters that break Docusaurus MDX parsing, specifically:
- Angle brackets <...> interpreted as HTML/JSX tags.
- Pipe characters | inside table cells breaking table structure.
Use fix_mdx_universal.py to robustly escape these characters.
Script: fix_mdx_universal.py
import os
# Known valid HTML tags to ignore (rendering as HTML is fine)
HTML_TAGS = {
    'br', 'hr', 'img', 'a', 'b', 'code', 'table', 'tr', 'td', 'th', 'div', 'span',
    # ... complete list
}

def fix_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        content = f.read()
    lines = content.split('\n')
    new_lines = []
    in_code_block = False
    for line in lines:
        if line.strip().startswith('```'):
            in_code_block = not in_code_block
            new_lines.append(line)
            continue
        if in_code_block:
            new_lines.append(line)
            continue
        # Logic to escape <tags> and invalid pipes |
        # ... (Refer to full script in /home/rezriz/github/scripts/fix_mdx_universal.py)
        # Key fix: escapes <placeholder> to `<placeholder>`
        # Key fix: escapes | in tables to \|
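The per-line escaping logic is elided above, so the following is only a sketch of one way the two key fixes could work, not the original implementation. It assumes that "invalid pipes" means pipes inside inline code within a table row, and that a table row is any line starting with `|`. The names `escape_placeholders`, `escape_code_pipes`, and `fix_line` are hypothetical, and `HTML_TAGS` here is a small illustrative subset:

```python
import re

# Illustrative subset; the original script keeps a fuller list.
HTML_TAGS = {"br", "hr", "img", "a", "b", "code", "table", "tr", "td", "th", "div", "span"}

# Splits a line so inline-code spans land at odd indices, plain text at even ones.
INLINE_CODE = re.compile(r"(`[^`]*`)")
# Matches <name>, </name>, or <name attr=...>, capturing the tag name.
TAG = re.compile(r"</?([A-Za-z][\w-]*)[^<>]*>")


def escape_placeholders(line):
    """Wrap non-HTML <placeholders> in backticks, skipping existing inline code."""
    def wrap(match):
        if match.group(1).lower() in HTML_TAGS:
            return match.group(0)  # known HTML tag: leave as-is
        return "`" + match.group(0) + "`"

    parts = INLINE_CODE.split(line)
    for i in range(0, len(parts), 2):  # even indices are outside inline code
        parts[i] = TAG.sub(wrap, parts[i])
    return "".join(parts)


def escape_code_pipes(line):
    """In a table row, escape unescaped pipes that sit inside inline code."""
    if not line.lstrip().startswith("|"):
        return line  # assumption: table rows start with a pipe
    parts = INLINE_CODE.split(line)
    for i in range(1, len(parts), 2):  # odd indices are inline-code spans
        parts[i] = re.sub(r"(?<!\\)\|", r"\\|", parts[i])
    return "".join(parts)


def fix_line(line):
    """Apply both fixes; placeholders first, so newly wrapped code gets its pipes escaped."""
    return escape_code_pipes(escape_placeholders(line))
```

For example, `fix_line("wp plugin install <plugin>")` wraps the placeholder in backticks, while `<br>` and comparisons such as `a < b` are left alone. Text such as `x<y and z>w` would still be matched as a tag, so the output is worth spot-checking on real pages.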
Execution Log Example
For a real migration execution record (source path, destination path, structure mapping, cleanup rules, and validation checks), see: